Adaptive Cross-Modal Embeddings for Image-Text Alignment

نویسندگان

چکیده

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Cross-modal Retrieval by Text and Image Feature Biclustering

We describe our approach to the ImageCLEF-Photo 2007 task. The novelty of our method consists of biclustering image segments and annotation words. Given the query words, we may select the image segment clusters that have strongest cooccurrence with the corresponding word clusters. These image segment clusters act as the selected segments relevant to a query. We rank text hits by our own tf.idf ...

متن کامل

Cross-modal domain adaptation for text-based regularization of image semantics in image retrieval systems

In query-by-semantic-example image retrieval, images are ranked by similarity of semantic descriptors. These descriptors are obtained by classifying each image with respect to a pre-defined vocabulary of semantic concepts. In this work, we consider the problem of improving the accuracy of semantic descriptors through cross-modal regularization, based on auxiliary text. A cross-modal regularizer...

متن کامل

Adaptive Color-Image Embeddings for Database Navigation

Proceedings of the 1998 IEEE Asian Conference on Computer Vision, Hong Kong We present a novel approach to the problem of navigating through a database of color images for the purpose of image retrieval. We endow the database with a metric for the color distributions of the images. We then use multi-dimensional scaling techniques to embed a group of images as points in a two-dimensional Euclide...

متن کامل

Cross-modal Embeddings for Video and Audio Retrieval

The increasing amount of online videos brings several opportunities for training self-supervised neural networks. The creation of large scale datasets of videos such as the YouTube8M allows us to deal with this large amount of data in manageable way. In this work, we find new ways of exploiting this dataset by taking advantage of the multi-modal information it provides. By means of a neural net...

متن کامل

Learning Deep Semantic Embeddings for Cross-Modal Retrieval

Deep learning methods have been actively researched for cross-modal retrieval, with the softmax cross-entropy loss commonly applied for supervised learning. However, the softmax cross-entropy loss is known to result in large intra-class variances, which is not not very suited for cross-modal matching. In this paper, a deep architecture called Deep Semantic Embedding (DSE) is proposed, which is ...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Proceedings of the AAAI Conference on Artificial Intelligence

سال: 2020

ISSN: 2374-3468,2159-5399

DOI: 10.1609/aaai.v34i07.6915